Space-Partitioning RANSAC
A new algorithm is proposed to accelerate RANSAC model quality calculations.
The method is based on partitioning the joint correspondence space, e.g., 2D-2D
point correspondences, into a pair of regular grids. The grid cells are mapped
by minimal sample models, estimated within RANSAC, to reject correspondences
that are inconsistent with the model parameters early. The proposed technique
is general. It works with arbitrary transformations even if a point is mapped
to a point set, e.g., as a fundamental matrix maps a point to an epipolar line. The
method is tested on thousands of image pairs from publicly available datasets
on fundamental and essential matrix, homography and radially distorted
homography estimation. On average, it reduces the RANSAC run-time by 41% with
provably no deterioration in the accuracy. It can be straightforwardly plugged
into state-of-the-art RANSAC frameworks, e.g., VSAC.
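
The early-rejection idea can be illustrated for the homography case with a minimal sketch. It is not the authors' implementation: the function names (`apply_h`, `bucket_matches`, `count_inliers`), the cell size, and the bounding-box test are assumptions made for illustration. The sketch also assumes that no grid cell crosses the line mapped to infinity by the homography, so that the image of a cell lies inside the bounding box of its four mapped corners.

```python
import math

def apply_h(H, x, y):
    """Map the point (x, y) through a 3x3 homography given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def bucket_matches(matches, cell):
    """Group matches by source grid cell, then by destination grid cell.
    This is done once, before any model is estimated."""
    grid = {}
    for p1, p2 in matches:
        src = (math.floor(p1[0] / cell), math.floor(p1[1] / cell))
        dst = (math.floor(p2[0] / cell), math.floor(p2[1] / cell))
        grid.setdefault(src, {}).setdefault(dst, []).append((p1, p2))
    return grid

def count_inliers(H, grid, cell, thr):
    """Count inliers of H, skipping whole buckets that cannot contain any."""
    inliers = tested = 0
    for (cx, cy), by_dst in grid.items():
        # Map the four corners of the source cell and take their bounding
        # box, padded by the inlier threshold.
        corners = [apply_h(H, (cx + i) * cell, (cy + j) * cell)
                   for i in (0, 1) for j in (0, 1)]
        lo_x = min(c[0] for c in corners) - thr
        hi_x = max(c[0] for c in corners) + thr
        lo_y = min(c[1] for c in corners) - thr
        hi_y = max(c[1] for c in corners) + thr
        for (gx, gy), members in by_dst.items():
            # Reject the bucket if its destination cell misses the box.
            if (gx * cell > hi_x or (gx + 1) * cell < lo_x or
                    gy * cell > hi_y or (gy + 1) * cell < lo_y):
                continue
            for p1, p2 in members:
                tested += 1
                qx, qy = apply_h(H, p1[0], p1[1])
                if math.hypot(qx - p2[0], qy - p2[1]) <= thr:
                    inliers += 1
    return inliers, tested

# Toy data: a pure translation and four matches (two consistent, two not).
H = [[1.0, 0.0, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
matches = [((10, 10), (15, 7)), ((100, 40), (105.5, 37.2)),
           ((10, 10), (300, 200)), ((50, 60), (0, 0))]
grid = bucket_matches(matches, cell=32)
result = count_inliers(H, grid, cell=32, thr=2.0)
```

In this toy run, the two inconsistent matches are discarded by the cell test alone, so the point-to-model residual is computed for only two of the four matches.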
Deep MAGSAC++
We propose Deep MAGSAC++ combining the advantages of traditional and deep
robust estimators. We introduce a novel loss function that exploits the
orientation and scale from partially affine covariant features, e.g., SIFT, in
a geometrically justifiable manner. The new loss helps in learning higher-order
information about the underlying scene geometry. Moreover, we propose a new
sampler for RANSAC that always selects the sample with the highest probability
of consisting only of inliers. After every unsuccessful iteration, the
probabilities are updated in a principled way via a Bayesian approach. The
prediction of the deep network is exploited as prior inside the sampler.
Benefiting from the new loss, the proposed sampler and a number of technical
advancements, Deep MAGSAC++ is superior to the state-of-the-art both in terms
of accuracy and run-time on thousands of image pairs from publicly available
real-world datasets for essential and fundamental matrix estimation.
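
A sampler of the kind described above can be sketched as follows. The abstract does not give the update rule, so everything specific here is an assumption: each point's inlier probability is modeled with a Beta distribution initialized from a prior (standing in for the network prediction), the sample is the set of points with the highest posterior mean, and a failed iteration is treated as one negative observation for every sampled point. The names `init_beta`, `select_sample`, `update_after_failure`, and the `strength` parameter are hypothetical.

```python
def init_beta(priors, strength=2.0):
    """Turn prior inlier probabilities into Beta(a, b) parameters.
    `strength` (an assumed pseudo-count) controls how much the prior is trusted."""
    return [[p * strength, (1.0 - p) * strength] for p in priors]

def select_sample(beta, m):
    """Pick the m points with the highest posterior mean a / (a + b)."""
    means = [a / (a + b) for a, b in beta]
    return sorted(range(len(beta)), key=lambda i: -means[i])[:m]

def update_after_failure(beta, sample):
    """Treat an unsuccessful iteration as one failed trial per sampled point."""
    for i in sample:
        beta[i][1] += 1.0

priors = [0.9, 0.8, 0.7, 0.5]        # stand-in for network predictions
beta = init_beta(priors)
first = select_sample(beta, m=2)     # the two most probable points
update_after_failure(beta, first)    # assume the resulting model was poor
second = select_sample(beta, m=2)    # the ranking changes after the update
```

The selection is deterministic given the current probabilities, so the sampler only moves on to a different sample because the update lowers the posterior of points that took part in a failed model.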
DGC-GNN: Descriptor-free Geometric-Color Graph Neural Network for 2D-3D Matching
Direct matching of 2D keypoints in an input image to a 3D point cloud of the
scene without requiring visual descriptors has garnered increased interest due
to its lower memory requirements, inherent privacy preservation, and reduced
need for expensive 3D model maintenance compared to visual descriptor-based
methods. However, existing algorithms often compromise on performance,
resulting in a significant deterioration compared to their descriptor-based
counterparts. In this paper, we introduce DGC-GNN, a novel algorithm that
employs a global-to-local Graph Neural Network (GNN) that progressively
exploits geometric and color cues to represent keypoints, thereby improving
matching robustness. Our global-to-local procedure encodes both Euclidean and
angular relations at a coarse level, forming the geometric embedding to guide
the local point matching. We evaluate DGC-GNN on both indoor and outdoor
datasets, demonstrating that it not only doubles the accuracy of the
state-of-the-art descriptor-free algorithm but also substantially narrows the
performance gap between descriptor-based and descriptor-free methods. The code
and trained models will be made publicly available.
Volumetric Semantically Consistent 3D Panoptic Mapping
We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at
generating comprehensive, accurate, and efficient semantic 3D maps suitable for
autonomous agents in unstructured environments. The proposed approach is based
on a Voxel-TSDF representation used in recent algorithms. It introduces novel
ways of integrating semantic prediction confidence during mapping, producing
semantic and instance-consistent 3D regions. Further improvements are achieved
by graph optimization-based semantic labeling and instance refinement. The
proposed method achieves accuracy superior to the state of the art on public
large-scale datasets, improving on a number of widely used metrics. We also
highlight a pitfall in the evaluation of recent studies: using the ground
truth trajectory as input instead of a SLAM-estimated one substantially affects
the accuracy, creating a large gap between the reported results and the actual
performance on real-world data.
Comment: 8 pages, 2 figures
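
One simple way to integrate semantic prediction confidence into a voxel map is to accumulate a confidence-weighted vote per label. The sketch below illustrates that general idea only; the abstract does not specify the paper's integration scheme, and the `Voxel` class and its methods are hypothetical.

```python
from collections import defaultdict

class Voxel:
    """A voxel that accumulates a confidence-weighted vote for each label."""

    def __init__(self):
        self.weights = defaultdict(float)

    def integrate(self, label, confidence):
        # Each 2D prediction projected into this voxel adds its confidence
        # to the running weight of the predicted label.
        self.weights[label] += confidence

    def label(self):
        # The voxel takes the label with the largest accumulated weight.
        if not self.weights:
            return None
        return max(self.weights, key=self.weights.get)

voxel = Voxel()
voxel.integrate("chair", 0.9)   # one confident observation
voxel.integrate("table", 0.4)   # two uncertain observations
voxel.integrate("table", 0.4)
fused = voxel.label()
```

In this example a plain majority vote would pick "table" (two observations against one), whereas the confidence-weighted vote keeps "chair" because 0.9 exceeds 0.4 + 0.4.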
Fully Differentiable RANSAC
We propose the fully differentiable ∇-RANSAC. It predicts the inlier
probabilities of the input data points, exploits the predictions in a guided
sampler, and estimates the model parameters (e.g., fundamental matrix) and their
quality while propagating the gradients through the entire procedure. The
random sampler in ∇-RANSAC is based on a clever re-parametrization
strategy, i.e., the Gumbel Softmax sampler, that allows propagating the
gradients directly into the subsequent differentiable minimal solver. The model
quality function marginalizes over the scores from all models estimated within
∇-RANSAC to guide the network toward learning accurate and useful
probabilities. ∇-RANSAC is the first to unlock the end-to-end training of
geometric estimation pipelines, containing feature detection, matching and
RANSAC-like randomized robust estimation. As a proof of its potential, we train
∇-RANSAC together with LoFTR, i.e., a recent detector-free feature
matcher, to find reliable correspondences in an end-to-end manner. We test
∇-RANSAC on a number of real-world datasets on fundamental and essential
matrix estimation. It is superior to the state-of-the-art in terms of accuracy
while being among the fastest methods. The code and trained models will be made
public.
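
The Gumbel Softmax re-parametrization mentioned above can be sketched in a few lines. This is only the forward pass in plain Python, written for illustration; propagating gradients through it would require an automatic differentiation framework, and the function name, the temperature value, and the toy probabilities are assumptions rather than details from the paper.

```python
import math
import random

def gumbel_softmax(logits, tau, rng):
    """Return a soft, random selection vector over the inputs.
    Gumbel(0, 1) noise is added to each logit and a softmax with
    temperature tau is applied; a lower tau gives a harder selection."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)       # avoid log(0)
        g = -math.log(-math.log(u))        # Gumbel(0, 1) sample
        noisy.append((logit + g) / tau)
    m = max(noisy)                         # subtract the max for stability
    exps = [math.exp(z - m) for z in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
# Log inlier probabilities of three points (toy values).
log_probs = [math.log(p) for p in (0.7, 0.2, 0.1)]
weights = gumbel_softmax(log_probs, tau=0.5, rng=rng)
```

The randomness enters only through the added noise, so the output is a smooth function of the logits, which is what makes the sampling step compatible with gradient-based training.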
Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature
Point cloud registration has seen recent success with several learning-based
methods that focus on correspondence matching and, as such, optimize only for
this objective. Following the learning step of correspondence matching, they
evaluate the estimated rigid transformation with a RANSAC-like framework. While
it is an indispensable component of these methods, it prevents a fully
end-to-end training, leaving the objective to minimize the pose error
unaddressed. We present a novel solution, Q-REG, which utilizes rich geometric
information to estimate the rigid pose from a single correspondence. Q-REG
allows robust estimation to be formalized as an exhaustive search, hence
enabling end-to-end training that optimizes over both objectives of
correspondence matching and rigid pose estimation. We demonstrate in the
experiments that Q-REG is agnostic to the correspondence matching method and
provides consistent improvement both when used only in inference and in
end-to-end training. It sets a new state-of-the-art on the 3DMatch, KITTI, and
ModelNet benchmarks.
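
To see why one correspondence can suffice, assume each matched point carries a local reference frame, i.e., a 3x3 orthonormal matrix whose columns are the local axes (for instance derived from surface curvature). The sketch below, which is an illustration under that assumption and not the Q-REG implementation, recovers the rotation by aligning the two frames, derives the translation from the point pair, and scores every such hypothesis exhaustively. All function names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def transform(R, t, p):
    return [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]

def pose_from_one(p, Fp, q, Fq):
    """Rigid pose from one correspondence with local frames Fp and Fq.
    If Fq = R Fp, then R = Fq Fp^T and t = q - R p."""
    R = matmul(Fq, transpose(Fp))
    Rp = transform(R, [0.0, 0.0, 0.0], p)
    t = [q[i] - Rp[i] for i in range(3)]
    return R, t

def exhaustive_search(corrs, thr):
    """Try the pose of every correspondence and keep the best-supported one."""
    best_pose, best_score = None, -1
    for p, Fp, q, Fq in corrs:
        R, t = pose_from_one(p, Fp, q, Fq)
        score = sum(1 for a, _, b, _ in corrs
                    if math.dist(transform(R, t, a), b) <= thr)
        if score > best_score:
            best_pose, best_score = (R, t), score
    return best_pose, best_score

# Toy data: a 90-degree rotation about z plus a translation, and one outlier.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Rz = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t_gt = [1.0, 2.0, 3.0]
points = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
corrs = [(p, I3, transform(Rz, t_gt, p), Rz) for p in points]
corrs.append(([0.0, 0.0, 0.0], I3, [9.0, 9.0, 9.0], I3))  # outlier
(best_R, best_t), best_score = exhaustive_search(corrs, thr=1e-6)
```

Because each hypothesis needs only one correspondence, the number of hypotheses equals the number of correspondences, so all of them can be evaluated without random sampling.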
Efficient solutions to the relative pose of three calibrated cameras from four points using virtual correspondences
We study the challenging problem of estimating the relative pose of three
calibrated cameras. We propose two novel solutions to the notoriously difficult
configuration of four points in three views, known as the 4p3v problem. Our
solutions are based on the simple idea of generating one additional virtual
point correspondence in two views by using the information from the locations
of the four input correspondences in the three views. For the first solver, we
train a network to predict this point correspondence. The second solver uses a
much simpler and more efficient strategy based on the mean points of three
corresponding input points. The new solvers are efficient and easy to implement
since they are based on the existing efficient minimal solvers, i.e., the
well-known 5-point relative pose and the P3P solvers. The solvers achieve
state-of-the-art results on real data. The idea of solving minimal problems
using virtual correspondences is general and can be applied to other problems,
e.g., the 5-point relative pose problem. In this way, minimal problems can be
solved using simpler non-minimal solvers or even using sub-minimal samples
inside RANSAC.
In addition, we compare different variants of 4p3v solvers with the baseline
solver for the minimal configuration consisting of three triplets of points and
two points visible in two views. We discuss which configuration of points is
potentially the most practical in real applications.
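
The mean-point strategy of the second solver can be sketched as follows. The abstract does not say which three points are averaged or in which pair of views, so the index choice and the function names below are assumptions; the subsequent call to a 5-point relative pose solver is not shown.

```python
def virtual_correspondence(pts_a, pts_b, idx=(0, 1, 2)):
    """Build one virtual correspondence between two views as the mean of
    three corresponding input points (selected by idx) in each view."""
    def mean(pts):
        n = len(idx)
        return (sum(pts[i][0] for i in idx) / n,
                sum(pts[i][1] for i in idx) / n)
    return mean(pts_a), mean(pts_b)

def augment(pts_a, pts_b, idx=(0, 1, 2)):
    """Append the virtual correspondence to four real ones, giving the
    five correspondences a 5-point relative pose solver expects."""
    va, vb = virtual_correspondence(pts_a, pts_b, idx)
    return list(pts_a) + [va], list(pts_b) + [vb]

# Four corresponding image points in two of the three views (toy values).
view_a = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
view_b = [(1.0, 1.0), (4.0, 1.0), (1.0, 4.0), (6.0, 6.0)]
five_a, five_b = augment(view_a, view_b)
```

The mean of three image points is generally only an approximate projection of a single 3D point, so the fifth correspondence is an approximation that a robust estimation loop such as RANSAC is expected to tolerate.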